Secure computation with horizontally partitioned data using adaptive regression splines

Authors

  • Joyee Ghosh
  • Jerome P. Reiter
  • Alan F. Karr
Abstract

When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates for some statistical models, including linear regressions, without sharing individual records' data. A drawback of these techniques is that the model must be specified before the protocol is initiated, and the usual exploratory strategies for finding good-fitting models have limited usefulness because the individual records are not shared. In this paper, we present a protocol for secure adaptive regression splines that allows for flexible, semi-automatic regression modeling. The protocol reduces the risk of model misspecification inherent in secure computation settings. We illustrate the protocol with air pollution data.
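The abstract notes that owners of horizontally partitioned data can compute linear regression estimates without sharing records. One common way to do this, sketched below, is secure summation of each owner's local sufficient statistics. This is only an illustration under stated assumptions, not the protocol from the paper: the names `MODULUS`, `SCALE`, `secure_sum`, `local_stats`, and `pooled_simple_regression` are hypothetical, the parties are simulated in a single process, and only a one-predictor regression is shown.

```python
import random

MODULUS = 2**61 - 1  # assumed public modulus, far larger than any true total
SCALE = 10**6        # assumed fixed-point scale for encoding real values


def encode(x):
    """Encode a real number as an integer modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v):
    """Map an encoded value back to a signed real number."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def secure_sum(local_values, rng=random):
    """Simulated ring-based secure summation.

    The first party starts the running total at a random mask, each party
    adds its own encoded value, and the first party removes the mask at the
    end. Intermediate totals look uniformly random to the other parties.
    """
    mask = rng.randrange(MODULUS)
    running = mask
    for value in local_values:
        running = (running + encode(value)) % MODULUS
    return decode((running - mask) % MODULUS)


def local_stats(xs, ys):
    """Sufficient statistics one owner computes on its own records."""
    return [
        float(len(xs)),
        sum(xs),
        sum(ys),
        sum(x * x for x in xs),
        sum(x * y for x, y in zip(xs, ys)),
    ]


def pooled_simple_regression(owners):
    """Fit y = b0 + b1 * x from securely summed statistics only."""
    per_owner = [local_stats(xs, ys) for xs, ys in owners]
    # Sum each statistic across owners without revealing individual terms.
    n, sx, sy, sxx, sxy = (secure_sum(column) for column in zip(*per_owner))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b0 = (sy - b1 * sx) / n
    return b0, b1


# Three hypothetical owners holding different records on the same variables.
owners = [
    ([1.0, 2.0], [5.0, 8.0]),
    ([3.0], [11.0]),
    ([4.0, 5.0, 6.0], [14.0, 17.0, 20.0]),
]
b0, b1 = pooled_simple_regression(owners)
```

Because only the summed statistics are revealed, the fitted coefficients match what the owners would get from the concatenated data, while no owner sees another owner's records. As the abstract points out, this kind of computation still requires the model to be fixed in advance.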


Similar resources

Secure computation with horizontally partitioned data using adaptive regressive splines

When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates f...


Secure Bayesian model averaging for horizontally partitioned data

When multiple data owners possess records on different subjects with the same set of attributes—known as horizontally partitioned data—the data owners can improve analyses by concatenating their databases. However, concatenation of data may be infeasible because of confidentiality concerns. In such settings, the data owners can use secure computation techniques to obtain the results of certain ...


SMC Protocol for Naïve Bayes Classification over Grid Partitioned Data using Multiple UTPs

When data are distributed both horizontally and vertically, they are referred to as grid partitioned data. This paper offers an SMC protocol for Naïve Bayes classification over grid partitioned data. It also presents a solution to the Secure Multi-party Computation (SMC) problem in the form of a protocol that preserves privacy. In this system, a protocol with several Un-trusted Third Parties (U...


The Privacy of k-NN Retrieval for Horizontal Partitioned Data -- New Methods and Applications

Recently, privacy issues have become important in clustering analysis, especially when data is horizontally partitioned over several parties. Associative queries are the core retrieval operation for many data mining algorithms, especially clustering and k-NN classification. The algorithms that efficiently support k-NN queries are of special interest. We show how to adapt well-known data structu...


ESTIMATING DRYING SHRINKAGE OF CONCRETE USING A MULTIVARIATE ADAPTIVE REGRESSION SPLINES APPROACH

In the present study, the multivariate adaptive regression splines (MARS) technique is employed to estimate the drying shrinkage of concrete. For this purpose, a very large database (RILEM Data Bank) from different experimental studies is used. Several effective parameters such as the age of onset of shrinkage measurement, age at start of drying, the ratio of the volume of the sample on its drying...



Journal title:
  • Computational Statistics & Data Analysis

Volume 51

Publication date: 2007